Outline


FPGA selection for flight missions 
Differentiating FPGAs 
Cost Analysis 
SEE Analysis 

Expanding Evaluation:

* Limitations of Bit Error Rate

* SET Performance Degradation Metric 

* Availability Calculation 

Applying Evaluation criteria to the selection 
process 
Flight Project FPGA Selection


Primary Considerations 

* Criticality 

* Number of Mega-Operations Per Second (MOPS) 

* Internal clock frequency 

* Number of operations performed at each clock edge

* Area/Power constraints

* Cost 

Analysis 

* SEE and Reliability testing 

* Integrating traditional SEE metrics with obtainable
MOPS


To be presented at Microelectronics Reliability & Qualification Workshop (MRQW), Dec. 4-5, 2007, Manhattan Beach, CA
General FPGA Architecture

Configuration: A Major Difference
between FPGA Classes 


FPGAs contain groups
of preexisting logic:
HARDWARE

Configuration:

* Arrangement of preexisting logic

* Defines Functionality

* Defines Connectivity

CONFIGURATION TYPES

Common types: Antifuse, SRAM-Based, FLASH-Based

* One-time configurable

* Re-configurable

Antifuse FPGA Devices (Actel
and Aeroflex)

* Pros:

* Most common FPGA devices utilized for space
missions - Heritage

* Configuration is fused (no transistors) and is thus
“HARDENED” - not affected ...

* Logic has embedded
mitigation at each DFF
(either TMR or DICE) -
eases the design phase

* Cons: 

* One time programmable 
- can complicate the 
design/debug phase 

* Very expensive 



in hardened Actel devices 


Single Point of Failure 


SRAM-Based FPGAs

Pros:

* The ability to reconfigure a function while in-flight is of
great advantage to many missions

* Device is less expensive

* Easier to debug/correct (with no mitigation)

* Performance (MOPS)

* Increased User Device Resources

Cons:

* Configuration is SRAM-based - increased sensitivity to
radiation (vs. antifuse)

* Additional design complexity necessary for mitigation

* Additional hardware necessary for (re)configuration
What Xilinx Does Well: Frequency
and Number of Mega-Operations per
Second

$NMOPS = f \cdot k$, where $f = 1 / \text{clock period}$

k: resource and speed dependent





Xilinx Virtex Series can supply a high frequency (f) with 
a large K value. NMOPS is very large compared to many 
other FPGA manufacturers 
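
As a rough illustration of the relation NMOPS = f * k, the sketch below computes mega-operations per second from a clock frequency and a per-cycle operation count. The function name `nmops` and the example values (100 MHz, k = 8) are hypothetical and are not taken from any device datasheet.

```python
def nmops(freq_hz: float, k: int) -> float:
    """Mega-operations per second: operations per clock edge (k) times frequency (f)."""
    if freq_hz <= 0 or k <= 0:
        raise ValueError("frequency and k must be positive")
    return freq_hz * k / 1e6


# Hypothetical example: 100 MHz design clock, 8 operations per clock edge.
example_nmops = nmops(100e6, 8)
print(example_nmops)
```

Doubling either the achievable design clock or the number of parallel operations doubles NMOPS, which is why k is described as resource and speed dependent.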





Xilinx FPGAs in Space:
Configuration and Scrubbing

[Figure: block diagram with blocks labeled Memory, Configuration Manager, and Configuration Manager + Scrubber]

Minimal Requirements

Full Reconfigure

To increase availability:
use Scrubber

Extra circuitry is
required regardless in
order to configure/re-configure
Criticality and Xilinx: Proposed
Solution: Full TMR

Very Complicated to ...


Triple the design 
within the Xilinx 
FPGA device 
(including I/O) 

User implemented 
(can lengthen 
design cycle) 

Will consume > 3x
of original area

Difficult to 
implement multiple 
clock domains 

Use an external 
FPGA device to 
scrub the 
configuration 
memory 


Cost Analysis

* Missions do not generally require a large number of
replicated FPGA devices

* Cost of a mission will not rely on FPGA device cost

* Design cycle can grossly affect cost:

* Complexity of design architecture; 

* One FPGA cannot handle required number of operations per
second.

* Chosen FPGA cannot handle availability specifications -
additional/complex mitigation is required.

* Complexity of verification 

* Complexity of Board 

* Poor choice in emulation or engineering models 

* Choose the FPGA that best meets 
requirements! 

Determining Reliability and
Availability: Radiation Testing and
SEE Analysis
Investigating Radiation Effects
(SEE Analysis)

Determine Bit sensitivity

* Flip Flops

* Configuration (SRAM based technology)

Availability analysis

* Given a function to implement - what is the
percentage of time the output is correct?

* Determine an availability rating that considers:

* Operational Frequency

* Fluence

* Repair time

* Burst time
What Function to Implement for
Testing?

[Comparison of candidate test functions; column headings are not recoverable from the source]

* ...ily not available at ...

* No functional masking

* Easy to base-line across FPGAs

* Cannot cover a significant amount of state space while testing

* Usually have to start from scratch at every error event

* Reduces test time

* Increases state space coverage

* Minimal state space coverage (short test runs - reset upon error)

* Only significant for specific design

Calculating Error Cross Sections

Traditional error calculation:

$\sigma_{error} = \frac{Events}{Fluence}$

Error calculation: bursts within data

$\sigma_{error} = \frac{Events}{TF - (TB \times FLUX)}$

* TF: Total Effective Fluence

* TB: Time in Burst
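
The burst adjustment removes the fluence delivered while the device was already in a burst from the denominator, Events / (TF - TB * FLUX). A minimal sketch follows; the function name, the units (fluence in particles/cm^2, flux in particles/cm^2/s, burst time in seconds), and the example numbers are assumptions made for illustration, not values from the test campaign.

```python
def error_cross_section(events: int, total_fluence: float,
                        time_in_burst_s: float = 0.0, flux: float = 0.0) -> float:
    """sigma_error = Events / (TF - TB * FLUX).

    With time_in_burst_s = 0 this reduces to the traditional Events / Fluence.
    """
    effective_fluence = total_fluence - time_in_burst_s * flux
    if effective_fluence <= 0:
        raise ValueError("burst time accounts for all fluence; cross section undefined")
    return events / effective_fluence


# Hypothetical run: 10 events over 1e7 particles/cm^2, 100 s spent in bursts at 1e4 particles/cm^2/s.
traditional = error_cross_section(10, 1e7)
burst_adjusted = error_cross_section(10, 1e7, time_in_burst_s=100.0, flux=1e4)
print(traditional, burst_adjusted)
```

Because fluence accumulated during a burst cannot register new events, subtracting it raises the computed cross section relative to the traditional value.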




Clock Frequency Effects (LET = 54 MeV*cm^2/mg)

Aeroflex: σ decreases as frequency increases

Actel: σ decreases as frequency decreases
[Plot: results versus Frequency (MHz) for architectures 4F4L, 0F4L, 4F8L, and 0F0L]

* Cross section fed to error rate calculator: based off of a cumulative
distribution probability function (P(T>t)) of event frequency

* Not analyzing how long we are in error

* Actel reported 2 MHz data follows nearly ...
Error Cross-Section Results
Prove for Antifuse Devices...

* Static testing is not sufficient

* Static simulation is not sufficient

* Assumptions of frequency response cannot
automatically be made

* Actel produced expected (traditional) response

* Aeroflex - unexpected: combinatorial logic acts as
transient filter





REAG Testing of Xilinx SRAM-Based 
FPGAs. 
Scrubbing Facts:

Most SRAM based FPGA faults are believed to occur in
configuration memory

Correction of fault can only be accomplished by:

* Reconfiguration - can be costly (time wise)

* Scrubbing

Reconfiguration brings down the system
While scrubbing, the system is fully operational.

Scrubbing does not reduce the probability of an upset 
occurring 

Frequency of scrubbing can reduce the amount of time 
the upset is present in the configuration memory 

Unable to scrub everything 

Warning: High Current spikes observed by Xilinx 
consortium: 

• Observed @ fluence =1e08 (1e05 < flux < 1e06): FLUX is 
extremely accelerated for scrubbing mitigation technique 

• Readback+CRC is performed at every frame - different than 
blind scrubber of REAG 

• REAG did not observe event: tests performed with flux < 1e03



Non-TMR Windowed Architecture 

N levels of logic between DFFs ... 2 strings each; N = 0, 8, and 20

REAG uses alternating data inputs to achieve accurate cross-sections
LET < 5 not yet tested - REAG
has not found on...

Error Cross Section Calculation:
Dealing with Bursts

$\sigma_{error} = \frac{Events}{TF - (TB \times FLUX)}$

* Cross-section based off of functional upsets (shift register)

* Simultaneous multiple errors exist in shift register



Evaluation Criteria and Device 
Selection 
Limitations with Error Cross Sections
as sole Evaluation Criteria

* Frequency Effect Analysis and Successful Operations
per second:

DUT A: @ 100 MHz over 1E07 fluence: no bursts, 10 errors

DUT B: @ 50 MHz over 1E07 fluence: no bursts, 5 errors

σA = 2 * σB; assumes constant error rate per frequency

Common Interpretation: Cross Section increases with Frequency - 
Decrease Clock Rate for Critical Missions 

* However, B has to run twice as long as A to 
complete the same number of successful operations 

* Illustrates that per number of completed operations, 
each has the same probability to accumulate an 
equivalent number of errors 

In this case: a slower clock does not influence errors
per successful operation
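
The DUT A / DUT B comparison can be checked numerically by normalizing errors to completed operations instead of fluence. In this sketch the flux value (1e4 particles/cm^2/s) and one operation per clock cycle are assumptions chosen only to make the arithmetic concrete; the error counts, frequencies, and fluence come from the slide.

```python
def errors_per_operation(errors: int, freq_hz: float, fluence: float,
                         flux: float, ops_per_cycle: int = 1) -> float:
    """Errors divided by the number of operations completed during irradiation."""
    exposure_s = fluence / flux                      # time needed to reach the fluence
    operations = freq_hz * ops_per_cycle * exposure_s
    return errors / operations


ASSUMED_FLUX = 1e4  # particles/cm^2/s, hypothetical

dut_a = errors_per_operation(errors=10, freq_hz=100e6, fluence=1e7, flux=ASSUMED_FLUX)
dut_b = errors_per_operation(errors=5, freq_hz=50e6, fluence=1e7, flux=ASSUMED_FLUX)
print(dut_a, dut_b)
```

Both devices come out with the same errors per operation even though the cross section of A is twice that of B, which is the point the slide makes against lowering the clock rate by default.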


Limitations with Error Cross Sections
as sole Evaluation Criteria
(Continued)

* Burst Analysis:

* Cross section probability calculation is based off of
event frequency (not event duration).

* Cross section does not consider burst or repair time 
(availability) 
Bit Error Rate Misconceptions:

Given a Bit Error rate of 5e-08, what does this mean?

AntiFuse

* Bit Error Rate is based
on DFFs

* Number of DFFs will be
from a few hundred to
10's of thousands

* Comes out to about
1 error every 10,000
days or better

SRAM

* Generally pertains to
configuration bit rate

* If, for example, 1e7 bits
can affect the design
upon upset - then can
have 1 upset every 2
days
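
The two interpretations follow from multiplying the same per-bit rate by very different numbers of sensitive bits. The sketch below assumes the rate is expressed in errors per bit per day (the unit is not stated on the slide, but it reproduces both figures) and uses 2,000 DFFs as a hypothetical antifuse design size within the stated range.

```python
def days_between_upsets(bit_error_rate_per_day: float, sensitive_bits: float) -> float:
    """Mean days between upsets for a design with the given number of sensitive bits."""
    device_rate = bit_error_rate_per_day * sensitive_bits  # upsets per day for the whole design
    return 1.0 / device_rate


BIT_ERROR_RATE = 5e-8  # errors/bit-day (unit assumed)

antifuse_days = days_between_upsets(BIT_ERROR_RATE, 2_000)  # hypothetical DFF count
sram_days = days_between_upsets(BIT_ERROR_RATE, 1e7)        # 1e7 design-affecting configuration bits
print(antifuse_days, sram_days)
```

The same quoted bit error rate therefore corresponds to roughly one error in 10,000 days for the antifuse case and one upset every 2 days for the SRAM case.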




SET Performance Metric:

* Given a failure rate (worst case is bit-error rate): MTTF

* Determines required operational frequency and necessary parallelism

$f \cdot k = \frac{NOP_{target}}{MTTF \cdot \left(1.0 - \frac{1.0}{Acc} \sum_{i=1}^{n} \frac{EC_i}{Cyc_{rad}}\right)}$

* $NOP_{target}$: Targeted number of operations

* $f \cdot k$: operational frequency * implemented number of operations (each cycle)

* $EC_i$: Number of clock cycles of error per event i

* $Cyc_{rad}$: Total number of operational clock cycles during irradiation

* $Acc$: Acceleration Factor
Availability Calculation using
Radiation Data

$A = \frac{MTTF}{MTTR + MTTF}$

A: Steady State Availability (A = 1 is a perfect system)

Device               MTTF               A
RTAX @ 150 MHz       3.6*10^6 * AccR    (3.6*10^6 * AccR) / (6.67*10^-9 + 3.6*10^6 * AccR)
Aeroflex @ 100 MHz   6.0*10^5 * AccA    (6.0*10^5 * AccA) / (10^-8 + 6.0*10^5 * AccA)
Xilinx @ 100 MHz     41 * AccX          (41 * AccX) / (1.6*10^-2 + 41 * AccX)
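
The steady-state availability A = MTTF / (MTTR + MTTF) is straightforward to evaluate once an acceleration factor is chosen. The sketch below uses the Xilinx @ 100 MHz coefficients from the slide (41 * AccX for MTTF, 1.6e-2 for MTTR); the acceleration factor values passed in are placeholders, since none is given in the source, and MTTF and MTTR are assumed to share the same time unit.

```python
def availability(mttf: float, mttr: float) -> float:
    """Steady-state availability A = MTTF / (MTTR + MTTF); A = 1 is a perfect system."""
    if mttf <= 0 or mttr < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf / (mttr + mttf)


def xilinx_availability(acc_x: float) -> float:
    """Xilinx @ 100 MHz entry: MTTF = 41 * AccX, MTTR = 1.6e-2 (coefficients from the slide)."""
    return availability(41.0 * acc_x, 1.6e-2)


# Placeholder acceleration factors, for illustration only.
for acc in (1.0, 1e3, 1e6):
    print(acc, xilinx_availability(acc))
```

A larger acceleration factor (a more benign flight environment relative to the beam) stretches MTTF while MTTR stays fixed, so availability approaches 1.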
Mission Device Selection

Xilinx showed a relatively low availability rating
at 100MHz.

* If used at full rate, will achieve much higher
operations per second.

* Higher MOPS can include scheduled downtime and
may be a great fit

Criticality and reliability play a major role in 
device selection 

* Missions have traditionally chosen antifuse devices 
for critical specifications. 

* Actel has been in the forefront 

* Aeroflex is very promising with its combinatorial transient 
filtering. 

* For less critical functionality, SRAM devices are 
being heavily investigated 
Embedded vs. User Implemented
TMR

[Table comparing devices by number of flip-flops and clock speed; row and column labels are not recoverable. Surviving entries: <400 to 1000; <2000 to 4000; <22,000; <400 ... MHz]

Discrete state space is 2^(# DFFs)

Add XTMR to Xilinx

* Observed area increase @ 5x and 6x

* I/O speed may be jeopardized (Simultaneously Switching Signals)

* Internal operational speed can be decreased

Not datasheet clock speeds ... actual design clock speeds
Understand Requirements -
Select Wisely

If criticality (reliability and availability) is essential:

* Antifuse FPGAs provide safer solutions

* Antifuse FPGAs can shorten the design cycle - More Cost Effective

* Verification is eased (mitigation is embedded and does not have to
be verified)

* Board design is simplified - do not have to triple I/O (signal
integrity requirements)

* Multiple clock domains are easier to implement

If MOPS is essential:

* SRAM based design can ease the design cycle (without
additional TMR)

* Available IP cores

* Re-programmability

* Number of high speed available resources

* SRAM based FPGAs currently provide the fastest internal
clocking (internal DLL + multiple embedded PowerPCs)


Summary

* Each FPGA type has its advantages: SEE analysis must
take this into account for a comprehensive comparison

* Sensitivity calculations are provided to missions to
assist in the selection process.

* Test to determine additional mitigation schemes required per 
FPGA 

* Bit Error calculations 

* Availability and degradation analysis 

* Formulae have been presented: 

* Adjust Bit error calculations due to long bursts 

* SET Performance degradation Metric 

* Availability 

* Mission Cost and design cycle are directly related. 

* Keep designs simple 

* Each FPGA has its advantages 

* Choose the best fit FPGA for your mission specifications 